Estimation for Receivers with Partial CSI
Abstract: The optimality of the conventional maximum-likelihood
sequence estimation (MLSE), also known as the Viterbi algorithm (VA),
relies on the assumption that the receiver has perfect knowledge of
the channel coefficients, i.e., the channel state information (CSI).
In practical situations where this assumption fails, the MLSE method
becomes suboptimal, and exhaustive checking is the only way to obtain
the ML sequence. Against this background, we consider the ML criterion
for partial CSI directly and propose a two-phase low-complexity MLSE
algorithm, in which the first phase performs the conventional MLSE
algorithm in order to retain the information needed by the backward VA
performed in the second phase. Simulations show that when the training
sequence is moderately long in comparison with the entire data block,
such as 1/3 of the block, the proposed two-phase MLSE can approach the
performance of the optimal exhaustive checking. In a more typical
case, where the training sequence consumes only 0.14 of the bandwidth,
the proposed method still clearly outperforms the conventional MLSE.



I. Introduction 

To combat the signal distortion caused by inter-symbol interference
in frequency-selective fading channels, a receiver generally needs a
channel estimator and an equalizer: the former estimates the channel
state information (CSI) from a training sequence, while the latter
detects the data using the CSI obtained by the former. In the
literature, a commonly used equalization method is the
Euclidean-distance-based maximum-likelihood sequence estimation
(MLSE) [1]. This MLSE is optimal if the estimator can perform perfect
channel estimation; however, when the channel estimator cannot pass
perfect CSI to the equalizer, the system performance degrades, which
has motivated research on receivers with only partial CSI.

The detection criterion for a receiver with only partial CSI,
usually referred to as a partially coherent receiver, has been
investigated in [2-6]. Specifically, these works found that the ML
criterion for a partially coherent receiver can be written as a
weighted sum of the ML criterion that assumes perfect CSI at the
receiver and the ML criterion that assumes no CSI at the receiver.
Since exhaustive checking is the only optimal method for performing
ML sequence estimation at a receiver without CSI, this finding makes
the usual Viterbi algorithm (VA) unsuitable for optimal sequence
detection when only partial CSI is available [6].

For this reason, we propose in this paper a two-phase method to
perform sequence estimation for a partially coherent receiver. In
short, the forward VA is executed in the first phase and generates
the information required by the backward VA, which uses the
partial-CSI ML criterion in the second phase. Simulation results
confirm that the proposed two-phase method can considerably
outperform the conventional MLSE over channels with only partial CSI
available.

Throughout this paper, the following notation is used. For a matrix
$X$, $\det(X)$ is its determinant, and $X^T$ and $X^H$ denote its
transpose and Hermitian transpose, respectively. Also, $I$ denotes
the identity matrix of a proper size.

II. System Model 

In this paper, we consider a signal $b = [b_1, \ldots, b_N]^T$
transmitted over a frequency-selective block fading (equivalently,
quasi-static fading) channel of memory order $P - 1$. For
$1 \le i \le N$ and $M > 0$, we restrict $b_i$ to be the output of
constant-amplitude $2^M$-PSK modulation; hence $|b_i|^2 = 1$. Among
the $N$ components of $b$, the first $T$ components are the training
sequence and are assumed known to the receiver, while the remaining
$N - T$ symbols are the data to be transmitted. The received vector
$y$ can thus be written as

$$ y = B h + n, \tag{1} $$

where $B = [B_P^T \; B_D^T]^T$ is formed by a $(T \times P)$ submatrix
$B_P$ and an $((L - T) \times P)$ submatrix $B_D$, which respectively
occupy the first $T$ rows and the remaining $L - T$ rows of $B$.
In (1), the noise $n$ is zero-mean circularly symmetric complex
Gaussian with correlation matrix $\sigma_n^2 I$, and
$h = [h_1, \ldots, h_P]^T$ denotes the channel taps, which remain
constant during an $L$-symbol transmission block, where
$L = N + P - 1$.

The underlying assumptions of the system are as follows. Perfect
frame synchronization is assumed, and adequate guard periods are
added between consecutive transmission blocks so that there is no
inter-block interference. In addition, both the transmitter and the
receiver know nothing about the channel coefficients $h$ except the
multipath parameter $P$. Notably, the training sequence does not
have to be placed at the beginning of $b$, but can be distributed
over the entire transmission block. It has however been shown that
placing the training sequence at the beginning of $b$, together with
$B_P^H B_P = T I$, minimizes the variance of the estimation error [2].
This justifies the model in (1), where $B_P$ is placed ahead of
$B_D$. The condition $B_P^H B_P = T I$ is accordingly assumed,
following [2].
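As a concrete illustration of (1), the following Python sketch builds a convolution matrix for a short BPSK block and forms the noiseless received vector. The row layout (row $k$ holds $[b_{k-P+1}, \ldots, b_k]$, with symbols outside the block set to zero) is an assumption chosen to agree with the branch vector $u_t$ used in Section III, and the helper names `conv_matrix` and `received` as well as the tap values are hypothetical.

```python
# Sketch of the block model y = B h + n in (1), standard library only.
# Assumption: row K (1-indexed) of B is [b_{K-P+1}, ..., b_K], with b_i = 0
# outside 1..N, so that B has L = N + P - 1 rows and P columns.

def conv_matrix(b, P):
    """Return the L x P convolution matrix of the symbol list b."""
    N = len(b)
    L = N + P - 1
    padded = [0] * (P - 1) + list(b) + [0] * (P - 1)
    # For 0-based row k (K = k + 1), padded[k + p] equals b_{K-P+1+p}.
    return [[padded[k + p] for p in range(P)] for k in range(L)]

def received(b, h, noise=None):
    """Return y = B h + n; the noise defaults to the all-zero vector."""
    B = conv_matrix(b, len(h))
    if noise is None:
        noise = [0] * len(B)
    return [sum(x * g for x, g in zip(row, h)) + n for row, n in zip(B, noise)]

b = [-1, -1, -1, 1, -1, 1, 1, -1]   # first T = 5 symbols act as training
h = [0.5 + 0.2j, 1.0 - 0.3j]        # P = 2 taps, illustrative values only
T = 5
B = conv_matrix(b, len(h))
B_P, B_D = B[:T], B[T:]             # training rows and data rows
y = received(b, h)
```

With $N = 8$ and $P = 2$ the matrix has $L = 9$ rows, of which the first five form $B_P$ and the last four form $B_D$.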



III. Criterion and Algorithm of the Proposed Two-Phase Method

Based on the system model in (1), we can divide the received signal
$y$ into two parts:

$$ y_P = B_P h + n_P, \qquad y_D = B_D h + n_D, $$

where $y_P$ and $y_D$ are defined via $y^H = [y_P^H \; y_D^H]$, and
$n_P$ and $n_D$ are similarly defined. Under the reasonable premise
that $T \ge P$, the least-squares estimate of $h$, given $B_P$ and
$y_P$, is equal to

$$ \hat{h} = (B_P^H B_P)^{-1} B_P^H y_P. $$
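The least-squares estimate above can be checked numerically with a few lines of Python. This is a minimal sketch, not the authors' implementation: the training rows below assume that row $k$ of $B_P$ is $[b_{k-1}, b_k]$ for $P = 2$ with $b_0 = 0$, the tap values are illustrative, and `solve` and `ls_estimate` are hypothetical helper names.

```python
# Least-squares channel estimate h_hat = (B_P^H B_P)^{-1} B_P^H y_P,
# standard library only.

def solve(A, rhs):
    """Solve A x = rhs for a small square complex system (Gauss-Jordan)."""
    n = len(A)
    M = [list(map(complex, row)) + [complex(v)] for row, v in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ls_estimate(B_P, y_P):
    """Return the LS estimate of h from training rows B_P and observations y_P."""
    P = len(B_P[0])
    gram = [[sum(complex(row[u]).conjugate() * row[v] for row in B_P)
             for v in range(P)] for u in range(P)]            # B_P^H B_P
    corr = [sum(complex(row[u]).conjugate() * y for row, y in zip(B_P, y_P))
            for u in range(P)]                                # B_P^H y_P
    return solve(gram, corr)

# Training sequence [-1, -1, -1, 1, -1] with P = 2 (assumed row layout).
B_P = [[0, -1], [-1, -1], [-1, -1], [-1, 1], [1, -1]]
h = [0.5 + 0.2j, 1.0 - 0.3j]
y_P = [sum(x * g for x, g in zip(row, h)) for row in B_P]     # noiseless
h_hat = ls_estimate(B_P, y_P)
```

In the noiseless case the estimate coincides with the true taps; with noise, the estimation error is what makes the receiver only partially coherent.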

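Then the ML decoding criterion for a receiver with only partial
CSI is given by [2]:

$$
\begin{aligned}
\hat{b}_{\mathrm{ML}} &= \arg\max_{b} \Pr(y_D \mid B_D, \hat{h}) \\
&= \arg\min_{b} \Big\{ \|y_D - B_D \hat{h}\|^2
 - (y_D - B_D \hat{h})^H Q_B (y_D - B_D \hat{h})
 + \sigma_n^2 \log\det\big(I + (B_P^H B_P)^{-1} B_D^H B_D\big) \Big\},
\end{aligned}
\tag{2}
$$

where

$$ Q_B = B_D \big(B_P^H B_P + B_D^H B_D\big)^{-1} B_D^H. $$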
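The three-term structure of (2) can be motivated by a short calculation. The sketch below assumes that $h$ is deterministic, that $n_P$ and $n_D$ are independent with covariance $\sigma_n^2 I$, and writes $A = B_P^H B_P$ and $r = y_D - B_D \hat{h}$; it is offered as a plausibility check and is not necessarily the derivation used in [2].

```latex
\begin{align*}
\hat{h} &= h + A^{-1} B_P^H n_P, \qquad A = B_P^H B_P, \\
r &= y_D - B_D \hat{h} = n_D - B_D A^{-1} B_P^H n_P, \\
\mathrm{E}[r r^H] &= \sigma_n^2 \left( I + B_D A^{-1} B_D^H \right), \\
\left( I + B_D A^{-1} B_D^H \right)^{-1}
  &= I - B_D \left( A + B_D^H B_D \right)^{-1} B_D^H = I - Q_B, \\
\det\left( I + B_D A^{-1} B_D^H \right)
  &= \det\left( I + A^{-1} B_D^H B_D \right).
\end{align*}
```

The fourth line is the matrix inversion lemma and the fifth is the determinant identity $\det(I + XY) = \det(I + YX)$. Multiplying the negative logarithm of the Gaussian density of $r$ by $\sigma_n^2$ then gives, up to a constant, $\|r\|^2 - r^H Q_B r + \sigma_n^2 \log\det(I + A^{-1} B_D^H B_D)$.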



At medium to high SNRs, the last term in (2) becomes negligible
compared with the first two terms; hence, a near-ML decoding
criterion is obtained as follows:

$$
\hat{b}_{\text{Near-ML}} = \arg\min_{b} \Big\{ \|y_D - B_D \hat{h}\|^2
 - (y_D - B_D \hat{h})^H Q_B (y_D - B_D \hat{h}) \Big\}. \tag{3}
$$
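To make the near-ML criterion concrete, the following Python sketch evaluates the metric in (3) for a candidate BPSK data sequence and finds its minimizer by exhaustive checking. It is a sketch under stated assumptions: row $k$ of the convolution matrix is taken to be $[b_{k-P+1}, \ldots, b_k]$, the quadratic term is computed as $e^H D^{-1} e$ with $e = B_D^H r$ and $D = B_P^H B_P + B_D^H B_D$, the true taps stand in for $\hat{h}$, and all helper names are hypothetical.

```python
# Near-ML metric of (3) and exhaustive checking over BPSK data, standard library only.
from itertools import product

def rows(b, P):
    """Convolution-matrix rows for the symbol list b (zeros outside the block)."""
    pad = [0] * (P - 1) + list(b) + [0] * (P - 1)
    return [[pad[k + p] for p in range(P)] for k in range(len(b) + P - 1)]

def solve(A, rhs):
    """Gauss-Jordan solution of a small square complex system A x = rhs."""
    n = len(A)
    M = [list(map(complex, r)) + [complex(v)] for r, v in zip(A, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * p for a, p in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gram(X):
    """Return X^H X for a matrix given as a list of rows."""
    P = len(X[0])
    return [[sum(complex(r[u]).conjugate() * r[v] for r in X) for v in range(P)]
            for u in range(P)]

def near_ml_metric(train, data, y, h_hat):
    """Return ||r||^2 - r^H Q_B r with r = y_D - B_D h_hat."""
    P, T = len(h_hat), len(train)
    B = rows(list(train) + list(data), P)
    B_P, B_D, y_D = B[:T], B[T:], y[T:]
    r = [yk - sum(x * g for x, g in zip(row, h_hat)) for row, yk in zip(B_D, y_D)]
    e = [sum(complex(row[u]).conjugate() * rk for row, rk in zip(B_D, r))
         for u in range(P)]                         # e = B_D^H r
    GP, GD = gram(B_P), gram(B_D)
    D = [[GP[u][v] + GD[u][v] for v in range(P)] for u in range(P)]
    x = solve(D, e)                                 # x = D^{-1} e
    quad = sum(eu.conjugate() * xu for eu, xu in zip(e, x)).real
    return sum(abs(rk) ** 2 for rk in r) - quad

def exhaustive_near_ml(train, n_data, y, h_hat):
    """Check every BPSK data sequence and return the minimizer of the metric."""
    return min(product([-1, 1], repeat=n_data),
               key=lambda d: near_ml_metric(train, d, y, h_hat))

train, data, h = [-1, -1, -1, 1, -1], [1, -1, -1, 1], [0.5 + 0.2j, 1.0 - 0.3j]
y = [sum(x * g for x, g in zip(row, h)) for row in rows(train + data, 2)]  # noiseless
best = exhaustive_near_ml(train, len(data), y, h)   # true taps used in place of h_hat
```

The search visits all $2^{N-T}$ candidates, which is why exhaustive checking becomes impractical for long blocks and motivates the two-phase method.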



It is noted that the criteria for both $\hat{b}_{\mathrm{ML}}$ and
$\hat{b}_{\text{Near-ML}}$ contain the Euclidean distance
$\|y_D - B_D \hat{h}\|^2$ as their first term, which can be easily
decomposed into a finite-state recursive expression that readily
suits the VA. However, the remaining terms in (2) and (3) do not have
finite-state recursive expressions, so the VA cannot be applied to
obtain either $\hat{b}_{\mathrm{ML}}$ or $\hat{b}_{\text{Near-ML}}$.

Against this background, we propose a two-phase method to perform
sequence estimation for a partially coherent receiver. The first
phase is exactly the MLSE using the Euclidean distance
$\phi_{L-T} = \|y_D - B_D \hat{h}\|^2$ in recursive form, i.e.,

$$ \phi_t \equiv \phi_{t-1} + |y_{t+T} - u_t^T \hat{h}|^2, \tag{4} $$

where "$\equiv$" denotes that the two sides are equivalent metrics in
decoding, and $y_{t+T}$ and $u_t^T$ are respectively the $(t+T)$th
component of $y$ and the $t$th row of $B_D$. In order to apply the
recursive metric in (4) on a VA trellis, we reformulate the
accumulated metric $\phi_t$ as a function of the trellis state $i$ as
follows:

$$ \phi_t(i) = \min_{1 \le j \le 2^{M(P-1)}} \big\{ \phi_{t-1}(j) + |c_t(j,i)|^2 \big\}, \tag{5} $$

where $t$ and $i$ range from $1$ to $L - T$ and from $1$ to
$2^{M(P-1)}$, respectively,

$$ c_t(j,i) = y_{t+T} - u_t(j,i)^T \hat{h}, \tag{6} $$

and $u_t(j,i) = [b_{t+T-P+1}(j,i), \ldots, b_{t+T}(j,i)]^T$ denotes
the signals corresponding to the trellis branch from state $j$ at
time $t - 1$ to state $i$ at time $t$. Meanwhile, two variables are
calculated during the execution of the first phase so that they can
be used in the second phase:

$$
\begin{cases}
\eta_t^{(\ell)}(i) = \eta_{t-1}^{(\ell)}(j) + c_t(j,i) \cdot b_{t+T-\ell}(j,i), \\
\rho_t^{(\ell)}(i) = \rho_{t-1}^{(\ell)}(j) + \big(b_{t+T}(j,i)\big)^{*} \cdot b_{t+T-\ell}(j,i),
\end{cases}
$$

where, in the above two formulas, $j$ is the minimizer of (5), and
$t$ and $\ell$ range from $1$ to $L - T$ and from $0$ to $P - 1$,
respectively.

In the second phase, a backward VA is performed. Since the
simulations in [2] show that (2) and (3) yield almost the same
performance, we adopt the criterion in (3) to reduce the
computational complexity. We then re-express the criterion in (3) in
an indirect backward recursive form:

$$ \Xi_t(j,i) = \varphi_{t+1}(j) - \lambda_t(j,i) + \phi_t(i) + |c_{t+1}(i,j)|^2, \tag{7} $$

where $j$ and $i$ are respectively the previous and current states
that define the concerned branch in the backward trellis,
$c_{t+1}(i,j)$ is defined in (6), and, except that $\phi_t(i)$ comes
from the first phase, the other two terms (i.e., $\varphi_{t+1}(j)$
and $\lambda_t(j,i)$) are computed backward-recursively as follows.
By letting

$$ \xi_t(i) = \arg\min_{1 \le j \le 2^{M(P-1)}} \Xi_t(j,i), $$

we have

$$ \varphi_t(j) = \varphi_{t+1}(\xi_t(j)) + |c_{t+1}(j, \xi_t(j))|^2, \tag{8} $$

and

$$ \lambda_t(j,i) = \sum_{u=0}^{P-1} \sum_{v=0}^{P-1} \delta_{u,v}(j,i) \cdot \big(\eta^{(u)}(j,i)\big)^{*} \, \eta^{(v)}(j,i), $$

where $\delta_{u,v}(j,i)$ is the entry at the $u$th row and the $v$th
column of the matrix $D(j,i)^{-1}$. Comparing with $Q_B$ in (2),
$D(j,i)$ plays the role of $B_P^H B_P + B_D^H B_D$ and
$\eta^{(u)}(j,i)$ that of the $u$th entry of
$B_D^H (y_D - B_D \hat{h})$, both evaluated along the path through
the branch $(j,i)$.

We end this section by summarizing our proposed two-phase method in
algorithmic form.

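The First Phase (Forward VA):
Step 1-1. Initialization:

For $0 \le \ell \le P-1$ and for $1 \le i \le 2^{M(P-1)}$, initialize
$\eta_0^{(\ell)}(i) = 0$ and $\rho_0^{(\ell)}(i) = 0$. Let
$\phi_0(1) = 0$ and $\phi_0(i) = \infty$ for $2 \le i \le 2^{M(P-1)}$.

Step 1-2. Recursion (from $t = 1$ to $t = L - T$):

For $1 \le i \le 2^{M(P-1)}$ and for $0 \le \ell \le P - 1$, compute

$$
\begin{cases}
\phi_t(i) = \min_{1 \le j \le 2^{M(P-1)}} \big( \phi_{t-1}(j) + |c_t(j,i)|^2 \big), \\
\xi_t(i) = \arg\min_{1 \le j \le 2^{M(P-1)}} \big( \phi_{t-1}(j) + |c_t(j,i)|^2 \big).
\end{cases}
$$

Update

$$
\begin{cases}
\eta_t^{(\ell)}(i) = \eta_{t-1}^{(\ell)}(\xi_t(i)) + c_t(\xi_t(i), i) \cdot b_{t+T-\ell}(\xi_t(i), i), \\
\rho_t^{(\ell)}(i) = \rho_{t-1}^{(\ell)}(\xi_t(i)) + \big(b_{t+T}(\xi_t(i), i)\big)^{*} \, b_{t+T-\ell}(\xi_t(i), i),
\end{cases}
$$

where

$$ c_t(j,i) = y_{T+t} - u_t(j,i)^T \hat{h} $$

and $u_t(j,i) = [b_{t+T-P+1}(j,i), \ldots, b_{t+T}(j,i)]^T$ consists
of the $P$ symbols corresponding to the trellis branch between state
$j$ at time $t - 1$ and state $i$ at time $t$.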
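The first phase can be sketched in Python for BPSK ($M = 1$) as follows. This is an illustration under stated assumptions rather than the authors' implementation: row $k$ of the convolution matrix is taken to be $[b_{k-P+1}, \ldots, b_k]$, symbols after the block are zero so that the last $P - 1$ trellis steps carry no data, whole survivor paths are stored instead of the pointers $\xi_t(i)$, and `rows` and `forward_va` are hypothetical names.

```python
# Forward (first-phase) Viterbi recursion for BPSK, standard library only.

def rows(b, P):
    """Convolution-matrix rows for the symbol list b (zeros outside the block)."""
    pad = [0] * (P - 1) + list(b) + [0] * (P - 1)
    return [[pad[k + p] for p in range(P)] for k in range(len(b) + P - 1)]

def forward_va(train, n_data, y, h_hat):
    """Return the best survivor's data symbols, its metric, and its eta and rho."""
    P, T = len(h_hat), len(train)
    start = tuple(train[T - P + 1:])          # last P-1 training symbols
    # state -> (phi, decided symbols, eta^(0..P-1), rho^(0..P-1))
    surv = {start: (0.0, [], [0j] * P, [0] * P)}
    for t in range(1, n_data + P):            # t = 1, ..., L - T
        symbols = (-1, 1) if t <= n_data else (0,)   # tail steps carry zeros
        new = {}
        for state, (phi, path, eta, rho) in surv.items():
            for s in symbols:
                u = state + (s,)              # [b_{t+T-P+1}, ..., b_{t+T}]
                c = y[t + T - 1] - sum(x * g for x, g in zip(u, h_hat))  # as in (6)
                cand = phi + abs(c) ** 2      # candidate metric for (5)
                nxt = u[1:]
                if nxt not in new or cand < new[nxt][0]:
                    # b_{t+T-l} is u[P-1-l] for l = 0, ..., P-1
                    new[nxt] = (cand, path + [s],
                                [e + c * u[P - 1 - l] for l, e in enumerate(eta)],
                                [r + s.conjugate() * u[P - 1 - l]
                                 for l, r in enumerate(rho)])
        surv = new
    phi, path, eta, rho = min(surv.values(), key=lambda v: v[0])
    return path[:n_data], phi, eta, rho

train, data, h = [-1, -1, -1, 1, -1], [1, -1, -1, 1], [0.5 + 0.2j, 1.0 - 0.3j]
y = [sum(x * g for x, g in zip(row, h)) for row in rows(train + data, len(h))]
detected, phi, eta, rho = forward_va(train, len(data), y, h)  # noiseless, true taps
```

Here the state is the tuple of the last $P - 1$ symbols, so the accumulated $\eta$ and $\rho$ travel with each survivor exactly as in the update step above.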



The Second Phase (Backward VA):
Step 2-1. Initialization:

For $0 \le \ell \le P-1$ and for $1 \le i \le 2^{M(P-1)}$, initialize
the two backward auxiliary variables indexed by $\ell$ and $i$ at
time $L - T + 1$ to $0$. Let $\varphi_{L-T+1}(1) = 0$ and
$\varphi_{L-T+1}(i) = \infty$ for $2 \le i \le 2^{M(P-1)}$.

Step 2-2. Recursion (from $t = L - T$ down to $t = 1$):

For $1 \le i \le 2^{M(P-1)}$, compute

$$
\begin{cases}
\Xi_t(j,i) = \big( \varphi_{t+1}(j) + |c_{t+1}(i,j)|^2 \big) + \phi_t(i) - \lambda_t(j,i), \\
\xi_t(i) = \arg\min_{1 \le j \le 2^{M(P-1)}} \Xi_t(j,i),
\end{cases}
$$

where the terms involved in the above computations have been
introduced previously.

Update

$$ \varphi_t(i) = \varphi_{t+1}(\xi_t(i)) + |c_{t+1}(i, \xi_t(i))|^2 $$

Step 2-3. Trace Back:

Output the best state sequence $[1, s_1, \ldots, s_{L-T}, 1]$, where
$s_t = \xi_t(s_{t-1})$, and its corresponding decision symbol
sequence.

IV. Complexity Analysis 

The computational complexity of the proposed algorithm consists of
the forward VA complexity $C_F$ and the backward VA complexity $C_B$.
Since both the forward VA and the backward VA operate on a trellis
having $2^{M(P-1)}(N - T)$ states, with $2^M$ branch metric
calculations for each state, these two complexities can be expressed
as

$$ C_F = N_F \cdot 2^M \cdot 2^{M(P-1)} (N - T) $$

and

$$ C_B = N_B \cdot 2^M \cdot 2^{M(P-1)} (N - T), $$

where $N_F$ and $N_B$ are the branch metric computational
complexities in the forward VA and the backward VA, respectively.

By convention, complex multiplications dominate the branch metric
computational complexity; therefore, $N_F$ and $N_B$ can be
approximated by the number of complex multiplications required in
the forward VA and the backward VA, respectively. In the forward VA,
there is a $P$-tap filter and two additional complex multiplications
for each branch, so we set $N_F = P + 2$. In the backward VA, each
branch metric calculation needs a $P$-tap filter for the calculation
of $\varphi_t(\cdot)$, $2P^2$ complex multiplications for
$\lambda_t(\cdot,\cdot)$, $P^3$ complex multiplications for
$\delta_{u,v}(\cdot,\cdot)$, and $4P$ complex multiplications for the
remaining variables. We then obtain $N_B = 5P + 2P^2 + P^3$. The
total computational complexity is accordingly given by

$$
C_F + C_B = (N_F + N_B) \cdot 2^{MP} (N - T)
          = (2 + 6P + 2P^2 + P^3) \cdot 2^{MP} (N - T). \tag{9}
$$
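The counts in (9) are easy to evaluate for the parameter sets used later in the simulations. The short Python sketch below does so; the function names are hypothetical, and the conventional MLSE count $P \cdot 2^{MP}(N - T)$ is the one quoted in the text.

```python
# Complex-multiplication counts from (9) and for the conventional MLSE.

def two_phase_multiplications(M, P, N, T):
    """Return C_F + C_B = (N_F + N_B) * 2^(M P) * (N - T)."""
    n_f = P + 2                          # forward branch cost N_F
    n_b = 5 * P + 2 * P ** 2 + P ** 3    # backward branch cost N_B
    return (n_f + n_b) * 2 ** (M * P) * (N - T)

def conventional_multiplications(M, P, N, T):
    """Return P * 2^(M P) * (N - T), the conventional MLSE count."""
    return P * 2 ** (M * P) * (N - T)

# BPSK (M = 1), P = 2, block of N = 70 symbols with T = 10 training symbols.
two_phase = two_phase_multiplications(1, 2, 70, 10)        # 7200
conventional = conventional_multiplications(1, 2, 70, 10)  # 480
```

For this setting the two-phase method needs 15 times as many complex multiplications as the conventional MLSE, and the ratio $(2 + 6P + 2P^2 + P^3)/P$ grows quickly with $P$.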

This complexity is considerably higher than that of the conventional
MLSE, which is $P \cdot 2^{MP} (N - T)$. However, it is much smaller
than the complexity of the optimal exhaustive checking decoder, which
is

$$ (N_F + N_B) \cdot 2^{MN+1}. \tag{10} $$



We remark at the end that because the complexity of the on-line
computation of $\delta_{u,v}(\cdot,\cdot)$ is high for a large $P$,
the proposed two-phase method may be more suitable for channels with
small $P$, or for a system with pre-filters at the receiver that
reduce the number of channel taps [7], [8].



V. Simulations 

For simplicity, only BPSK modulation is considered in the
simulations; thus $M = 1$. The channel coefficients $h$ are zero-mean
complex Gaussian distributed with $E[h h^H] = (1/P) I$ and $P = 2$.
By the system model introduced in Section II, the signal-to-noise
power ratio per information bit is given by

$$ \frac{E_b}{N_0}\ (\mathrm{dB}) = 10 \log_{10} \cdots $$

We first examine our proposed two-phase method using a data sequence
of length $N = 15$, of which 5 symbols form the training sequence
$[-1, -1, -1, 1, -1]$. Figure 1 then shows that the word error rate
(WER) of our proposed two-phase method is almost the same as that of
the exhaustive checking scheme using criterion (3). This figure also
indicates that our proposed two-phase method outperforms the
conventional MLSE by about 0.8 dB. All three schemes estimate the
channel coefficients via a least-squares (LS) estimator. This result
confirms that our proposed two-phase MLSE (designed based on
criterion (3) for complexity saving) can closely approach the
performance of exhaustive checking when the training sequence is
moderately long (for example, 1/3) in comparison with the entire
block size.

Next, we consider a longer block of length $N = 70$, of which only
10 symbols form the training sequence
$[1, 1, 1, 1, 1, -1, 1, -1, 1, -1]$. Note that the training sequence
consumes around $10/70 \approx 0.14$ of the bandwidth.$^{1}$ Figure 2
then shows that the proposed two-phase method still maintains a
0.7 dB advantage over the conventional MLSE with LS estimation.
Because the exhaustive checking method is no longer feasible at this
block length, we provide the performance of the conventional MLSE
with perfect CSI in this figure as a genie-aided reference, i.e., a
lower bound on the achievable error rate.

$^{1}$ This number is smaller than what is considered in a GSM data
burst, where a 148-bit normal burst contains a 26-bit training
sequence.

Fig. 1. Word error rates (WERs) of three MLSE schemes in block fading
channels for a data burst of length 15, of which 5 symbols form the
training sequence.

In Figs. 3 and 4, we examine our proposed two-phase method over the
Gauss-Markov fading channel [11], [12]. In this channel, the channel
coefficients, which were fixed within a data burst period in the
block fading model, vary according to

$$ h_t = \alpha \cdot h_{t-1} + \sqrt{1 - \alpha^2} \, v_t \tag{11} $$

for $1 \le t \le L$, where $h_0$ and $v_t$ are independent of each
other and are zero-mean Gaussian random vectors with covariance
matrix $(1/P) I$. By (11), it can be easily verified that the SNR
per information bit remains

$$ \frac{E_b}{N_0}\ (\mathrm{dB}) = 10 \log_{10} \cdots = 10 \log_{10} \frac{\cdots}{\sigma_n^2}. $$
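The recursion in (11) can be simulated with a few lines of Python. The sketch below is illustrative only: it assumes that $h_0$ and $v_t$ are circularly symmetric complex Gaussian, so each real and imaginary part has variance $1/(2P)$, and the function name `gauss_markov_taps` and the seeded generator are hypothetical choices.

```python
# Gauss-Markov tap evolution h_t = alpha * h_{t-1} + sqrt(1 - alpha^2) * v_t, as in (11).
import math
import random

def gauss_markov_taps(P, L, alpha, rng):
    """Return [h_1, ..., h_L], each a list of P complex taps."""
    def draw():
        s = math.sqrt(0.5 / P)   # standard deviation of each real/imaginary part
        return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(P)]
    h = draw()                   # h_0
    g = math.sqrt(1.0 - alpha ** 2)
    taps = []
    for _ in range(L):
        v = draw()
        h = [alpha * a + g * b for a, b in zip(h, v)]
        taps.append(h)
    return taps

# N = 70 and P = 2 give L = N + P - 1 = 71 symbol periods per block.
taps = gauss_markov_taps(P=2, L=71, alpha=0.9999, rng=random.Random(0))
```

Because the weights $\alpha$ and $\sqrt{1 - \alpha^2}$ have squares summing to one, the covariance of $h_t$ stays at $(1/P) I$ for every $t$; setting $\alpha = 1$ recovers the block fading model.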



The data format tested in Figs. 3 and 4 is the same as that used in
Fig. 2. An additional scheme is included for comparison with our
proposed two-phase method, namely the MLSE with an adaptive least
mean square (LMS) filter [9], [10]. This filter has been shown to be
effective in tracking time-varying channels. Under the assumption
that the receiver can perfectly estimate the value of $\alpha$, the
step size of the LMS filter used in our simulations is set to
$\sqrt{1 - \alpha^2}/2$. Figure 3 then shows that for
$\alpha = 0.9999$, our two-phase method outperforms the other two
equalization schemes. The simulation result for $\alpha = 0.999$ in
Fig. 4 indicates a similar performance gain of our two-phase method
over the other two equalization schemes, except that a performance
floor appears at high SNR. We again provide the performance of the
conventional MLSE with perfect CSI in these two figures as
genie-aided references.



VI. Conclusion 

After establishing a recursive expression of the ML criterion for a
partially coherent receiver, we proposed a two-phase MLSE algorithm
in this paper. Simulation results show that our method outperforms
the conventional MLSE in both quasi-static block fading channels and
time-varying Gauss-Markov channels. A possible direction for future
work is to modify our algorithm to provide soft outputs so that it
can work iteratively with an outer coding scheme.

Fig. 2. Bit error rates (BERs) of three MLSE schemes in block fading
channels for a data burst of length 70, of which 10 symbols form the
training sequence.

Fig. 3. Bit error rates (BERs) of three MLSE schemes in Gauss-Markov
channels with $\alpha = 0.9999$ for a data burst of length 70, of
which 10 symbols form the training sequence.

Fig. 4. Bit error rates (BERs) of three MLSE schemes in Gauss-Markov
channels with $\alpha = 0.999$ for a data burst of length 70, of
which 10 symbols form the training sequence.

[Plot legend: conventional MLSE with genie-aided perfect estimation;
proposed two-phase MLSE with LS estimation; conventional MLSE with LS
estimation and LMS filter; conventional MLSE with LS estimation.
Horizontal axis: $E_b/N_0$ (dB).]